Sequential projective measurements for channel decoding

We study the transmission of classical information in quantum channels. We present a decoding
procedure that is very simple but still achieves the channel capacity. It is used to give an alternative
straightforward proof that the classical capacity is given by the regularized Holevo bound. This
procedure uses only projective measurements and is based on successive "yes"/"no" tests only.

According to quantum information theory, to transfer classical signals we must encode them into
the states of quantum information carriers, transmit these through the (possibly noisy)
communication channel, and then decode the information at the channel output. Frequently, even if
no entanglement between successive information carriers is employed in the encoding or is
generated by the channel, a joint measurement procedure is necessary (e.g. see [3]) to achieve the
capacity of the communication line, i.e. the maximum transmission rate per channel use. This is
clear from the original proofs that the classical channel capacity is provided by the regularization
of the Holevo bound: these proofs employ a decoding procedure based on detection schemes (the
'Pretty-Good-Measurement' or its variants). Alternative decoding schemes were also derived in [18]
(with a combinatorial approach) and in [21] (with an application of quantum hypothesis testing,
which had previously been introduced in this context). Here we present a simple decoding procedure
which uses only dichotomic projective measurements, but which is nonetheless able to achieve the
channel capacity.

The main idea is that even if the possible alphabet states (i.e. the states of a single information
carrier) are not orthogonal at the output of the channel, the codewords composed of a long sequence
of alphabet states approach orthogonality asymptotically, as the number of letters in each codeword
goes to infinity. Thus, one can sequentially test whether each codeword is at the output of the
channel. When one gets the answer "yes", the probability of error is small (as the other codewords
have little overlap with the tested one). When one gets the answer "no", the state has been
disturbed very little and can still be employed to test for the other codewords. To reduce the
accumulation of errors during a long sequence of tests that yield "no" answers, every time a "no" is
obtained, we have to project the state back onto the space that contains the typical output of the
channel. Summarizing, the procedure is: (1) test whether the channel output is the first codeword;
(2) if "yes", we are done; if "no", project the system onto the typical subspace and abort with an
error if the projection fails; (3) repeat the above procedure for all the other codewords until we
get a "yes" (or abort with an error if we test all of them without getting "yes"); (4) in the end,
we have either identified the codeword that was sent or had to abort.
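
The two ingredients of this argument (codewords become nearly orthogonal as n grows, and a "no" answer barely disturbs the state) can be made concrete with a minimal Python sketch. It assumes a hypothetical two-letter alphabet of pure qubit states, $|0\rangle$ and $|+\rangle$, and two codewords that differ in half of their letters; the alphabet, that assumption and the function names are illustrative choices, not part of the original text.

```python
import math

def codeword_overlap(letter_overlap, differing_positions):
    """|<a|b>| for two pure product codewords that differ in the given number of letters."""
    return abs(letter_overlap) ** differing_positions

def no_outcome(overlap):
    """Probability of a "no" when testing a wrong codeword, and the fidelity left afterwards.

    Testing the projector |b><b| on the sent state |a> answers "no" with
    probability 1 - |<b|a>|^2. The renormalized post-measurement state then
    has fidelity 1 - |<b|a>|^2 with |a>.
    """
    p_no = 1.0 - overlap ** 2
    fidelity = 1.0 - overlap ** 2
    return p_no, fidelity

# Hypothetical alphabet: |0> and |+>, whose overlap is 1/sqrt(2).
letter_overlap = 1.0 / math.sqrt(2.0)
for n in (4, 16, 64):
    overlap = codeword_overlap(letter_overlap, n // 2)  # assume half the letters differ
    p_no, fidelity = no_outcome(overlap)
    print(f"n={n:3d}  overlap={overlap:.3e}  P(no)={p_no:.6f}  fidelity={fidelity:.6f}")
```

For pure product codewords the overlap is the letter overlap raised to the number of differing positions, so it decays exponentially in n, and the disturbance caused by a "no" decays with it.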

We start by reviewing some basic notions on typicality. Then, we prove that the above procedure
achieves the classical capacity of the channel. An alternative (more formal) proof that refers to
this same method has been presented elsewhere, using a decoding strategy in which the "yes/no"
measurements discriminate only among the typical subspaces of the codewords. An application of our
scheme to communication over Gaussian bosonic channels will be presented in Ref. [25].

Definitions and review: For notational simplicity we will consider codewords composed of
unentangled states. For general channels, entangled codewords must be used to achieve capacity, but
the extension of our theory to this case is straightforward (replacing the Holevo bound with its
regularized version).

Consider a quantum channel that is fed with a letter $j$ from a classical alphabet with probability
$p_j$. The letter $j$ is encoded into a state of the information carriers which is evolved by the
channel into an output $\rho_j = \sum_k p_{k|j} |k\rangle_j\langle k|$, where
${}_j\langle k'|k\rangle_j = \delta_{k'k}$. Hence, the average output is

$$\rho = \sum_j p_j \rho_j = \sum_k p_k |k\rangle\langle k| , \qquad (1)$$

where $|k\rangle_j$ and $|k\rangle$ are the eigenvectors of the $j$th output-alphabet density matrix
and of the average output respectively. The subtleties of quantum channel decoding arise because the
$\rho_j$ typically commute neither with each other nor with $\rho$. The
Holevo-Schumacher-Westmoreland (HSW) theorem implies that we can send classical information reliably
down the channel at a rate (bits per channel use) given by the Holevo quantity

$$\chi = S(\rho) - \sum_j p_j S(\rho_j) , \qquad (2)$$

where $S(\cdot) = -\mathrm{Tr}[(\cdot)\log_2(\cdot)]$ is the von Neumann entropy. This rate can be
asymptotically attained in the multi-channel uses scenario as $\lim_{n\to\infty} (\log_2 N_n)/n$,
where a set $\mathcal{C}_n$ of $N_n$ codewords $\vec{j} = (j_1, \cdots, j_n)$ formed by long
sequences of the letters $j$ is used to reliably transfer $N_n$ distinct classical messages.
Similarly to the Shannon random-coding theory [27], the codewords $\vec{j} \in \mathcal{C}_n$ can be
chosen at random among the typical sequences generated by the probability $p_j$, in which each
letter $j$ of the alphabet occurs approximately $p_j n$ times. As mentioned in the introduction, the
HSW theorem uses the 'Pretty-Good-Measurement' procedure to decode the codewords of $\mathcal{C}_n$
at the output of the channel. We will now show that a sequence of binary projective measurements
suffices.
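
As a concrete illustration of the Holevo quantity, the following minimal Python sketch evaluates $\chi = S(\rho) - \sum_j p_j S(\rho_j)$ for a hypothetical ensemble of two equiprobable pure qubit states, $|0\rangle$ and $|+\rangle$. The ensemble, the restriction to real symmetric 2x2 density matrices and the function names are assumptions made only for this example.

```python
import math

def binary_entropy(p):
    """Shannon entropy (in bits) of the distribution (p, 1 - p)."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

def qubit_entropy(rho):
    """Von Neumann entropy of a real symmetric, unit-trace 2x2 matrix [[a, b], [b, d]]."""
    (a, b), (_, d) = rho
    # The eigenvalues of such a matrix are (1 +/- gap) / 2.
    gap = math.sqrt((a - d) ** 2 + 4.0 * b * b)
    return binary_entropy((1.0 + gap) / 2.0)

def holevo_quantity(probs, states):
    """chi = S(sum_j p_j rho_j) - sum_j p_j S(rho_j) for qubit output states."""
    avg = [[sum(p * r[i][k] for p, r in zip(probs, states)) for k in range(2)]
           for i in range(2)]
    return qubit_entropy(avg) - sum(p * qubit_entropy(r) for p, r in zip(probs, states))

# Hypothetical two-letter alphabet: the pure states |0> and |+> with equal priors.
rho_0 = [[1.0, 0.0], [0.0, 0.0]]
rho_plus = [[0.5, 0.5], [0.5, 0.5]]
chi = holevo_quantity([0.5, 0.5], [rho_0, rho_plus])
print(f"chi = {chi:.4f} bits per channel use")
```

For this ensemble the output states are pure, so $\chi = S(\rho) \approx 0.60$ bits per channel use, below the one bit that two orthogonal states would give.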

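Sequential measurements for channel decoding: The channel output state
$\rho_{\vec{j}} = \rho_{j_1} \otimes \cdots \otimes \rho_{j_n}$ associated to a generic typical
sequence $\vec{j} = (j_1, \cdots, j_n)$ possesses a typical subspace $\mathcal{H}_{\vec{j}}$ spanned
by the vectors $|k_1\rangle_{j_1} \cdots |k_n\rangle_{j_n} \equiv |\vec{k}\rangle_{\vec{j}}$, where
$|k\rangle_j$ occurs approximately $p_j p_{k|j} n = p_{jk} n$ times. The subspace
$\mathcal{H}_{\vec{j}}$ has dimension $\approx 2^{n \sum_j p_j S(\rho_j)}$, independent of the input
$\vec{j} \in \mathcal{C}_n$. Moreover, a typical output subspace $\mathcal{H}$ and a projector $P$
onto it exist such that, for any $\epsilon > 0$ and sufficiently large $n$,

$$\mathrm{Tr}\,\bar{\rho} > 1 - \epsilon , \qquad (3)$$

where $\bar{\rho} \equiv P \rho \otimes \cdots \otimes \rho P$ is the projection of the $n$-output
average density matrix onto $\mathcal{H}$. Notice that $\mathcal{H}$ and the
$\mathcal{H}_{\vec{j}}$'s in general differ. Typicality for $\mathcal{H}$ implies that, for
$\delta > 0$ and sufficiently large $n$, the eigenvalues $\lambda_i$ of $\bar{\rho}$ and the
dimension of $\mathcal{H}$ are bounded as

$$\lambda_i \leq 2^{-n(S(\rho) - \delta)} , \qquad (4)$$

$$\#\,\text{nonzero eigenvalues} = \dim(\mathcal{H}) \leq 2^{n(S(\rho) + \delta)} . \qquad (5)$$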
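
The bounds (3)-(5) can be checked directly in the simplest commuting case. The sketch below assumes a hypothetical diagonal qubit state $\rho = \mathrm{diag}(p, 1-p)$, for which the typical subspace of $\rho^{\otimes n}$ is spanned by the basis strings whose eigenvalue lies between $2^{-n(S(\rho)+\delta)}$ and $2^{-n(S(\rho)-\delta)}$. The state, the parameter values and the function name are illustrative assumptions.

```python
import math

def typical_subspace_stats(n, p, delta):
    """Entropy, weight, dimension and largest eigenvalue of the delta-typical
    subspace of rho^(tensor n), with rho = diag(p, 1 - p)."""
    entropy = -p * math.log2(p) - (1 - p) * math.log2(1 - p)
    weight, dim, max_eig = 0.0, 0, 0.0
    for k in range(n + 1):  # k = number of letters carrying eigenvalue p
        log_eig = k * math.log2(p) + (n - k) * math.log2(1 - p)
        if abs(-log_eig / n - entropy) <= delta:
            count = math.comb(n, k)          # strings with this eigenvalue
            dim += count
            weight += count * 2.0 ** log_eig
            max_eig = max(max_eig, 2.0 ** log_eig)
    return entropy, weight, dim, max_eig

for n in (100, 1000):
    entropy, weight, dim, max_eig = typical_subspace_stats(n, p=0.1, delta=0.1)
    print(f"n={n}: weight={weight:.4f}, "
          f"log2(dim)/n={math.log2(dim) / n:.3f} (S+delta={entropy + 0.1:.3f}), "
          f"-log2(max eigenvalue)/n={-math.log2(max_eig) / n:.3f} (S-delta={entropy - 0.1:.3f})")
```

The weight plays the role of $\mathrm{Tr}\,\bar{\rho}$ in Eq. (3) and approaches one as $n$ grows, while the dimension and the largest eigenvalue respect Eqs. (5) and (4) by construction.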

Define then the operator

$$\tilde{\rho} \equiv P \Big( \sum_{\vec{j}, \vec{k} \in \mathrm{typ}} p_{\vec{j}}\, p_{\vec{k}|\vec{j}}\, |\vec{k}\rangle_{\vec{j}}\langle \vec{k}| \Big) P \leq \bar{\rho} , \qquad (6)$$

where the inequality follows because the summation is restricted to the $\vec{j}$'s that are typical
sequences of the classical source, and to the states $|\vec{k}\rangle_{\vec{j}}$ which span the
typical subspace of the $\vec{j}$-th output. [Without these limitations, the inequality would be
replaced by an equality.] Consequently the maximum eigenvalue of $\tilde{\rho}$ is no greater than
that of $\bar{\rho}$, while the number of nonzero eigenvalues of $\tilde{\rho}$ cannot be greater
than that of $\bar{\rho}$, i.e. Eqs. (3)-(5) apply also to $\tilde{\rho}$.

Now we come to our main result. To distinguish between the $N_n$ distinct codewords of
$\mathcal{C}_n$, we perform sequential von Neumann measurements corresponding to projections onto
the possible outputs $|\vec{k}\rangle_{\vec{j}}$ to find the channel input (as shown elsewhere,
these can also be replaced by joint projectors on the spaces $\mathcal{H}_{\vec{j}}$). In between
these measurements, we perform von Neumann measurements that project onto the typical output
subspace $\mathcal{H}$.



We will show that as long as the rate at which we send information down the channel is bounded above
by the Holevo quantity (2), these measurements identify the proper input to the channel with
probability one in the limit that the number of uses of the channel goes to infinity. That is, we
send information down the channel at a rate $R$ smaller than $\chi$, so that there are
$N_n \simeq 2^{nR}$ possible randomly selected codewords $\vec{j}$ that could be sent down over $n$
uses. Each codeword gives rise to $\simeq 2^{n \sum_j p_j S(\rho_j)}$ possible typical outputs
$|\vec{k}\rangle_{\vec{j}}$. As always with Shannon-like random coding arguments [27], our set of
possible outputs occupies only a fraction $2^{-n(\chi - R)}$ of the full output space. This
sparseness of the actual outputs in the full space is the key to obtaining asymptotic zero error
probability: all our error probabilities will scale as $2^{-n(\chi - R)}$.
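
The counting behind this sparseness can be sketched numerically. To leading order, the decoder tests about $2^{nR}$ codewords with about $2^{n \sum_j p_j S(\rho_j)}$ typical outputs each, inside a typical subspace of dimension about $2^{n S(\rho)}$. The entropy values below are hypothetical numbers chosen only for illustration, and the function name is an assumption of this example.

```python
def occupied_fraction(n, rate, avg_output_entropy, total_entropy):
    """Leading-order ratio of tested output states to the typical-subspace dimension.

    tested states  ~ 2**(n * rate) codewords times 2**(n * avg_output_entropy) outputs each
    typical space  ~ 2**(n * total_entropy)
    The ratio equals 2**(-n * (chi - rate)) with chi = total_entropy - avg_output_entropy.
    """
    return 2.0 ** (n * (rate + avg_output_entropy - total_entropy))

# Hypothetical entropies: S(rho) = 0.9 and sum_j p_j S(rho_j) = 0.3, so chi = 0.6.
for n in (10, 100, 1000):
    below = occupied_fraction(n, rate=0.5, avg_output_entropy=0.3, total_entropy=0.9)
    print(f"n={n:4d}  R=0.5 < chi: fraction = {below:.3e}")
```

With a rate below $\chi$ the fraction shrinks exponentially in $n$; with a rate above $\chi$ the same expression exceeds one, meaning that the tested states can no longer all be nearly orthogonal.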

The codeword sent down the channel is some typical sequence $\vec{j}$, which yields some typical
output $|\vec{k}\rangle_{\vec{j}}$ with probability $p_{\vec{k}|\vec{j}}$. We begin with a von
Neumann measurement corresponding to projectors $P$, $1 - P$ to check whether the output lies in the
typical subspace $\mathcal{H}$. From Eq. (3) we can conclude that for any $\epsilon > 0$, for
sufficiently large $n$, this measurement yields the result "yes" with probability larger than
$1 - \epsilon$. We follow this with a binary projective measurement with projectors

$$P_{\vec{k}_1|\vec{j}_1} \equiv |\vec{k}_1\rangle_{\vec{j}_1}\langle \vec{k}_1| , \qquad 1 - P_{\vec{k}_1|\vec{j}_1} , \qquad (7)$$

to check whether the input was $\vec{j}_1$ and the output was $\vec{k}_1$. If this measurement
yields the result "yes", we conclude that the input was indeed $\vec{j}_1$. Usually, however, this
measurement yields the result "no". In this case, we perform another measurement to check for
typicality, and move on to a second trial output state, e.g., $|\vec{k}_2\rangle_{\vec{j}_1}$. If
this measurement yields the result "yes", we conclude that the input was $\vec{j}_1$. Usually, of
course, the measurement yields the result "no", and so we project again and move on to a third trial
output state, $|\vec{k}_3\rangle_{\vec{j}_1}$, etc. Having exhausted the
$O(2^{n \sum_j p_j S(\rho_j)})$ typical output states from the codeword $\vec{j}_1$, we turn to the
typical output states from the input $\vec{j}_2$, then $\vec{j}_3$, and so on, moving through the
$N_n \simeq 2^{nR}$ codewords until we eventually find a match. The maximum number of measurements
that must be performed is hence

$$M \simeq 2^{nR}\, 2^{n \sum_j p_j S(\rho_j)} . \qquad (8)$$

The probability amplitude that, after $m$ trials without finding the correct state, we find it at
the $(m+1)$th trial can then be expressed as

$$A_m(\mathrm{yes}) \equiv {}_{\vec{j}}\langle \vec{k}|\, P (1 - P_{\ell_1}) P \cdots P (1 - P_{\ell_m}) P \,|\vec{k}\rangle_{\vec{j}} , \qquad (9)$$

where for $q = 1, \cdots, m$ the operators $P_{\ell_q}$ represent the first $m$ elements
$P_{\vec{k}|\vec{j}}$ that compose the decoding sequence of projectors. The error probability
$P_{\mathrm{err}}(\vec{j}, \vec{k})$ of mistaking the vector $|\vec{k}\rangle_{\vec{j}}$ can then be
bounded by considering the worst case scenario in which the codeword sent is the last one tested in
the sequence. Since this is the worst that can happen, $|A_M(\mathrm{yes})|$, with $m = M$, is the
smallest possible, so that $P_{\mathrm{err}}(\vec{j}, \vec{k}) \leq 1 - |A_M(\mathrm{yes})|^2$.
Recall that the input codewords $\vec{j}$ are randomly selected from the set of typical input
sequences, and the $\vec{k}$'s are typical output sequences. Then, the average error probability for
a randomly selected set of input codewords can be bounded as
$\langle P_{\mathrm{err}} \rangle \leq 1 - \langle |A_M(\mathrm{yes})|^2 \rangle \leq 1 - |\langle A_M(\mathrm{yes}) \rangle|^2$.
Here $\langle \cdots \rangle$ represents the average over all possible codewords of a given selected
codebook $\mathcal{C}_n$ and the averaging over all possible codebooks of codewords. The
Cauchy-Schwarz inequality
$\langle |A_M(\mathrm{yes})|^2 \rangle \geq |\langle A_M(\mathrm{yes}) \rangle|^2$ was employed.
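The last term can be evaluated as

$$\langle A_m(\mathrm{yes}) \rangle = \mathrm{Tr}\Big[ P \Big(1 - \sum_{\ell_1} \pi_{\ell_1} P_{\ell_1}\Big) P \cdots P \Big(1 - \sum_{\ell_m} \pi_{\ell_m} P_{\ell_m}\Big) P \tilde{\rho} \Big] = \mathrm{Tr}\big[(P - \tilde{\rho})^m \tilde{\rho}\big] = \sum_{k=0}^{m} \binom{m}{k} (-1)^k\, \mathrm{Tr}\big[\tilde{\rho}^{k+1}\big] , \qquad (10)$$

where $\pi_\ell$ stands for the probability $p_{\vec{j}}\, p_{\vec{k}|\vec{j}}$ and where we used
(6) and (7) to write $\tilde{\rho} = \sum_\ell \pi_\ell P P_\ell P$. To prove the optimality of our
decoding, it is hence sufficient to show that $\langle A_m(\mathrm{yes}) \rangle \to 1$ even when
the number $m$ of measurements is equal to its maximum possible value $M$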
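of Eq. (8). Consider then Eqs. (4) and (5), which imply the inequalities

$$\mathrm{Tr}\big[\tilde{\rho}^{k+1}\big] \leq 2^{n[S(\rho)+\delta]}\, 2^{-n(k+1)[S(\rho)-\delta]} = 2^{2n\delta}\, 2^{-nk[S(\rho)-\delta]} . \qquad (11)$$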



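Use this and Eq. (3) to rewrite Eq. (10) as

$$\langle A_m(\mathrm{yes}) \rangle = \mathrm{Tr}\,\tilde{\rho} + \sum_{k=1}^{m} \binom{m}{k} (-1)^k\, \mathrm{Tr}\big[\tilde{\rho}^{k+1}\big] \geq 1 - \epsilon - \gamma , \qquad (12)$$

where $\gamma = 2^{2n\delta} [(1 + c_n)^m - 1]$, with $c_n \equiv 2^{-n[S(\rho)-\delta]}$. Since
$S(\rho) > \delta$, for large $n$ we can write

$$c_n = 2^{n[-S(\rho)+\delta]} \to 0 , \qquad (1 + c_n)^m - 1 \simeq m c_n . \qquad (13)$$

Hence, $\gamma$ is asymptotically negligible as long as $2^{2n\delta} m c_n$ is vanishing for
$n \to \infty$. This yields the constraint

$$m \ll 2^{n(S(\rho) - 3\delta)} \quad \text{for all } m . \qquad (14)$$

In particular, it must hold for $M$, the largest value of $m$ given in (8). Imposing this, the
decoding procedure yields a vanishing error probability if the rate $R$ satisfies

$$R < \chi - 3\delta , \qquad (15)$$

as required by the Holevo bound.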
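
The behaviour of $\gamma = 2^{2n\delta}[(1 + c_n)^m - 1]$ can be checked numerically. The sketch below writes the number of measurements as $m = 2^{n\mu}$; the exponent $\mu$, the numerical values of $S(\rho)$ and $\delta$, and the function name are hypothetical choices made for illustration only.

```python
import math

def gamma_bound(n, m_exponent, entropy, delta):
    """gamma = 2^(2 n delta) * [(1 + c_n)^m - 1], with c_n = 2^(-n (S - delta))
    and m = 2^(n * m_exponent)."""
    c_n = 2.0 ** (-n * (entropy - delta))
    m = 2.0 ** (n * m_exponent)
    # expm1 and log1p keep precision when c_n is tiny and m is huge.
    return 2.0 ** (2 * n * delta) * math.expm1(m * math.log1p(c_n))

# Hypothetical values: S(rho) = 1 and delta = 0.01.
for n in (50, 100, 200):
    small_m = gamma_bound(n, m_exponent=0.92, entropy=1.0, delta=0.01)
    large_m = gamma_bound(n, m_exponent=0.99, entropy=1.0, delta=0.01)
    print(f"n={n:3d}  gamma(m=2^(0.92 n)) = {small_m:.3e}   gamma(m=2^(0.99 n)) = {large_m:.3e}")
```

With these numbers $2^{2n\delta} m c_n = 2^{-0.05 n}$ for the smaller exponent, so $\gamma$ vanishes as $n$ grows, whereas for the larger exponent the same product grows as $2^{0.02 n}$ and the bound becomes useless.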

Summarizing, we have shown that under the condition (15) the average amplitude
$\langle A_m(\mathrm{yes}) \rangle$ of identifying the correct codeword is asymptotically close to 1
even in the worst case in which we had to check over all the other codewords, $m = M$. This implies
that the average probability of error in identifying the codeword asymptotically vanishes. In other
words, the procedure works even when the measurements are chosen so that the codeword sent is the
last one tested in the sequence of tests. Note that the same results presented here can also be
obtained starting from the direct calculation of the error probability [23] (instead of using the
probability amplitude).

We conclude by noting that from Eq. (9) one immediately sees that the POVM $\{E_\ell\}$ relative to
the global decoding procedure is

$$E_1 = P P_1 P ; \qquad E_2 = P (1 - P_1) P P_2 P (1 - P_1) P ;$$

$$E_\ell = P (1 - P_1) P (1 - P_2) P \cdots P (1 - P_{\ell-1}) P \times P_\ell \cdots (1 - P_1) P ; \qquad E_0 = 1 - \sum_{\ell} E_\ell ,$$

where $P_i$ is defined as in (7) and $E_0$ is the "abort" result. We gave a simple realization of
this POVM using sequential "yes/no" projections, but different realizations may be possible. It is
an alternative to the conventional Pretty-Good-Measurement. Note also that, with the exception of
$E_0$, all the operators in this POVM are simply projections onto pure states or onto their
orthogonal complements. Such a sequence of projective measurements is 'asymptotically unentangling'
in the sense that the output state departs at most infinitesimally from its original separable form
throughout the entire decoding procedure. This clarifies that the role of entanglement in the
decoding is analogous to [29]: namely, increasing the distinguishability of a multi-partite set of
states that are not orthogonal when considered by separate parties that do not employ entanglement.
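
The sequential realization of this POVM can be sketched in a few lines of Python. The example works with real vectors, keeps the state unnormalized so that each squared amplitude is directly the probability $\langle\psi|E_\ell|\psi\rangle$, and uses a hypothetical stand-in for the typical projector $P$ (it keeps the first few coordinates). The toy codebook, the form of $P$ and the function names are assumptions of this illustration, not the construction used in the proof.

```python
import math

def project_typical(state, typical_dim):
    """Hypothetical stand-in for P: keep the first `typical_dim` coordinates."""
    return [x if i < typical_dim else 0.0 for i, x in enumerate(state)]

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

def outcome_probabilities(state, codewords, typical_dim):
    """Return ([p_1, ..., p_M], p_abort) for the sequential yes/no decoder.

    The state is kept unnormalized, so each squared amplitude is the joint
    probability of every earlier "no" (and successful typicality check)
    followed by a "yes" on the current codeword.
    """
    state = project_typical(state, typical_dim)
    probs = []
    for c in codewords:
        amplitude = inner(c, state)
        probs.append(amplitude ** 2)                           # "yes" on this codeword
        state = [s - amplitude * x for s, x in zip(state, c)]  # "no": apply 1 - |c><c|
        state = project_typical(state, typical_dim)            # typicality check
    return probs, 1.0 - sum(probs)

# Toy codebook with two strongly overlapping codewords; the second one is sent.
s = 1.0 / math.sqrt(2.0)
codebook = [[1.0, 0.0, 0.0, 0.0], [s, s, 0.0, 0.0]]
probs, p_abort = outcome_probabilities(codebook[1], codebook, typical_dim=3)
print("P(yes on codeword 1, 2):", probs, " P(abort):", p_abort)
```

In this deliberately bad toy codebook the two codewords have overlap $1/\sqrt{2}$, so the decoder names the wrong codeword with probability 0.5, the right one with probability 0.25, and aborts otherwise; the asymptotic argument relies on such overlaps becoming exponentially small.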

Conclusions: Using projective measurements in a sequential fashion, we gave a new proof that it is
possible to attain the Holevo capacity when a noisy quantum channel is used to transmit classical
information. Such measurements provide an alternative to the usual Pretty-Good-Measurements for
channel decoding, and can be used in many of the same situations. In particular, an analogous
procedure can be used to decode channels that transmit quantum information, to approach the coherent
information limit [32]. This follows simply from the observation [32] that the transfer of quantum
messages over the channel can be formally treated as a transfer of classical messages imposing an
extra constraint of privacy in the signaling.